Revisiting “Forward node-selecting queries over trees”
Abstract
XML is a World Wide Web Consortium (W3C) standard for tree-structured data. XPath [Clark and DeRose 1999] is an important language widely employed in XML query, transformation, and update languages. XPath is a language of path expressions that can be viewed as defining sets of nodes of a tree, by following axis steps and applying node tests or path-existence filters to navigate from the root of the tree. For example, the XPath expression /descendant::A[child::B] selects all nodes in the tree below the root whose label is A and that have a child labeled B; here, descendant and child are axis steps, and the brackets indicate a filter that tests for the existence of a path matching the expression inside. A significant complication in XPath is the presence of both forward and reverse axis steps. If implemented naively, for example by repeatedly traversing the tree in forward or backward directions, queries that mix forward and reverse edges can be very expensive to evaluate. For example, an XPath query
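The filter expression quoted in the abstract, /descendant::A[child::B], can be illustrated with a minimal Python sketch. This is not the paper's evaluation technique, only a direct reading of the expression using the standard library's `xml.etree.ElementTree`. The helper name `select_a_with_b_child` and the sample document (labels R, A, B, C and the `id` attributes) are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def select_a_with_b_child(root):
    """Return elements labeled A that have at least one child labeled B."""
    # root.iter("A") visits root and all of its descendants in document
    # order, which plays the role of the descendant::A step here.
    # node.find("B") inspects direct children only: the [child::B] filter.
    return [node for node in root.iter("A") if node.find("B") is not None]


# Hypothetical sample tree; the id attributes exist only to name the nodes.
doc = ET.fromstring(
    "<R>"
    "<A id='1'><B/></A>"                    # A with a B child: selected
    "<A id='2'><C><B/></C></A>"             # B is a grandchild: not selected
    "<C><A id='3'><B/><A id='4'/></A></C>"  # A3 selected, A4 has no child
    "</R>"
)

selected = [node.get("id") for node in select_a_with_b_child(doc)]
print(selected)  # ['1', '3']
```

Both steps in this expression are forward axes, so a single top-down traversal suffices; the cost concern raised in the abstract applies to queries that also use reverse axes.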
Similar resources
Efficient Processing of Expressive Node-Selecting Queries on XML Data in Secondary Storage: A Tree Automata-based Approach
We propose a new, highly scalable and efficient technique for evaluating node-selecting queries on XML trees which is based on recent advances in the theory of tree automata. Our query processing techniques require only two linear passes over the XML data on disk, and their main memory requirements are in principle independent of the size of the data. The overall running time is O(m + n), where...
Learning Node Selecting Tree Transducer from Completely Annotated Examples
A base problem in Web information extraction is to find appropriate queries for informative nodes in trees. We propose to learn queries for nodes in trees automatically from examples. We introduce node selecting tree transducer (NSTT) and show how to induce deterministic NSTTs in polynomial time from completely annotated examples. We have implemented learning algorithms for NSTTs, started apply...
Schema-Guided Induction of Monadic Queries
The induction of monadic node selecting queries from partially annotated XML-trees is a key task in Web information extraction. We show how to integrate schema guidance into an RPNI-based learning algorithm, in which monadic queries are represented by pruning node selecting tree transducers. We present experimental results on schema guidance by the DTD of HTML.
Learning Monadic Queries for Semi-Structured Documents from Positive Examples
Querying for nodes in trees is a core operation for information extraction from semi-structured documents in XML or HTML. We show that regular monadic queries for nodes in trees can be identified from positive examples, and this in polynomial time when represented by deterministic node selecting transducers that we introduce.
Learning n-Ary Node Selecting Tree Transducers from Completely Annotated Examples
We present the first algorithm for learning n-ary node selection queries in trees from completely annotated examples by methods of grammatical inference. We propose to represent n-ary queries by deterministic n-ary node selecting tree transducers (n-NSTTs). These are tree automata that capture the class of monadic second-order definable n-ary queries. We show that n-NSTT defined polynomially bou...
Publication date: 2013